Prediction of Martensite Start Temperature in Steels Using Cross Validation
Mounika Chevva, Hooman Sabarou (Advisor: Dr. Samantha Seals)
2024-11-17
Introduction
Cross-validation Overview
A statistical technique for evaluating the performance and generalizability of machine learning models.
Divides dataset into training and validation subsets.
Ensures model training on one subset and validation on another.
Advantages:
Provides more reliable estimates of model performance.
Reduces bias compared to a single train-test split.
Improves model generalizability by leveraging different training and validation data.
Methods
K-Fold Cross-Validation: The dataset is split into k folds. The model is trained on k-1 folds and validated on the remaining fold, and this is repeated k times so that each fold serves as the validation set once (Kohavi 1995).
Leave-One-Out Cross-Validation (LOOCV): A special case of K-Fold in which k equals the number of observations, so each sample serves as the validation set exactly once.
Nested Cross-Validation: Used for model selection and hyperparameter tuning. An inner loop handles training and hyperparameter optimization, and an outer loop validates the tuned model on data the inner loop never saw.
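The three splitting schemes above can be sketched as index generators. This is a minimal illustration in plain Python (the study itself was carried out in R); the function names `k_fold_indices`, `loocv_indices`, and `nested_cv_indices`, the seed, and the toy size of 10 observations are assumptions made for the example, not the authors' code.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [j for f in range(k) if f != held_out for j in folds[f]]
        yield train, validation


def loocv_indices(n):
    """LOOCV is k-fold with k equal to the number of observations."""
    return k_fold_indices(n, n)


def nested_cv_indices(n, outer_k, inner_k, seed=0):
    """Yield (outer_train, outer_validation, inner_splits).

    Inner splits are drawn only from the outer training indices, so the
    outer validation fold never influences hyperparameter tuning.
    """
    for train, validation in k_fold_indices(n, outer_k, seed):
        inner_splits = [
            ([train[i] for i in tr], [train[i] for i in va])
            for tr, va in k_fold_indices(len(train), inner_k, seed + 1)
        ]
        yield train, validation, inner_splits


# Toy example with 10 observations (hypothetical size).
five_fold = list(k_fold_indices(10, 5))
loo = list(loocv_indices(10))
nested = list(nested_cv_indices(10, 5, 2))
print(len(five_fold), len(loo), len(nested))
```

Every observation appears in exactly one validation fold, which is why each scheme uses all of the data for both training and validation.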
Model Measures of Error (MOE)
- Definition: Measures of Error (MOE) quantify the difference between predicted values and actual outcomes, helping assess model performance.
- Mean Absolute Error (MAE): The average absolute difference between predicted and actual values.
\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|
where y_i is the actual value, \hat{y}_i is the predicted value, and n is the total number of observations.
- Root Mean Squared Error (RMSE): The square root of the mean squared error (MSE), providing error in the same units as the target variable (Chai and Draxler 2014).
\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
where y_i is the actual value, \hat{y}_i is the predicted value, and n is the total number of observations.
- Coefficient of Determination (R²): Represents the proportion of variance in the dependent variable that can be explained by the independent variables (Draper and Smith 1998).
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
where \bar{y} is the mean of the actual values.
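The three error measures can be computed directly from their definitions. The sketch below is plain Python for illustration (the study used R); the four actual and predicted values are made-up numbers, not taken from the dataset.

```python
import math


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean squared error, in the same units as y."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    """1 minus the ratio of residual to total sum of squares."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted values, for illustration only.
y_actual = [300.0, 350.0, 400.0, 450.0]
y_pred = [310.0, 340.0, 390.0, 470.0]

print(mae(y_actual, y_pred))
print(rmse(y_actual, y_pred))
print(r_squared(y_actual, y_pred))
```

Because RMSE squares the errors before averaging, the single 20-unit miss pulls RMSE above MAE in this example.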
Introduction to the Dataset
Martensite Starting Temperature
- A materials science dataset about steel
- Contains the Martensite Starting Temperature (Ms, in degrees Celsius) and the content of chemical elements (in weight percent)
- Ms changes with the chemistry of a steel
- Ms is important because it controls the strength of the steel
- The data has 16 variables and 1543 observations
Application
Data Exploration and Visualization
In our study, we analyzed the Martensite dataset from Wentzien et al. (2024), which supports predicting the Martensite Start Temperature (Ms) of steel alloys from their chemical compositions.
- Martensite start temperature (Ms) is the target variable.
- C, Mn, Si, Cr, and Ni are the predictor variables.
Figure: Correlation matrix
Modeling and Results
Linear Regression Model
Linear regression is a fundamental statistical technique that establishes a relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data.
In our dataset, which focuses on predicting the Martensite Start Temperature (Ms) of steel based on its chemical composition (C, Mn, Si, Cr, Ni), linear regression allows us to quantify how changes in these elements influence Ms.
M_s = \beta_0 +\beta_1 C +\beta_2 Mn + \beta_3 Si + \beta_4 Cr + \beta_5 Ni
M_s = 746.99 - 254.86 C - 24.24 Mn - 13.28 Si - 7.83 Cr - 14.64 Ni
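As a worked example, the fitted equation can be evaluated for a single steel. The sketch below is plain Python for illustration (the study used R). The coefficients are the unrounded estimates reported for this model; the composition in `example_steel` is hypothetical and not taken from the dataset.

```python
# Fitted linear regression estimates for Ms (predictors in weight percent).
INTERCEPT = 746.99268
COEFFICIENTS = {
    "C": -254.85890,
    "Mn": -24.24356,
    "Si": -13.28195,
    "Cr": -7.82620,
    "Ni": -14.64102,
}


def predict_ms(composition):
    """Evaluate Ms = b0 + b1*C + b2*Mn + b3*Si + b4*Cr + b5*Ni."""
    return INTERCEPT + sum(
        COEFFICIENTS[element] * composition[element] for element in COEFFICIENTS
    )


# Hypothetical composition in weight percent (assumption for illustration).
example_steel = {"C": 0.40, "Mn": 0.80, "Si": 0.30, "Cr": 1.00, "Ni": 1.50}
ms_example = predict_ms(example_steel)
print(round(ms_example, 2))
```

All five coefficients are negative, so adding any of these elements lowers the predicted Ms, with carbon having by far the largest effect per weight percent.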
Linear Regression Model Coefficients
| Term | Estimate | Std. Error | t value | p-value |
|---|---|---|---|---|
| (Intercept) | 746.99268 | 4.0289613 | 185.405771 | < 0.0000001 |
| C | -254.85890 | 5.7347802 | -44.440919 | < 0.0000001 |
| Mn | -24.24356 | 2.5175491 | -9.629826 | < 0.0000001 |
| Si | -13.28195 | 3.6933099 | -3.596218 | 0.0003357 |
| Cr | -7.82620 | 0.7366216 | -10.624451 | < 0.0000001 |
| Ni | -14.64102 | 0.2895086 | -50.571976 | < 0.0000001 |
Statistics
Residual standard error: 54.28 on 1230 degrees of freedom
Multiple R-squared: 0.7433, Adjusted R-squared: 0.7422
F-statistic: 712.2 on 5 and 1230 DF, p-value: < 2.2e-16
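The reported summary statistics can be cross-checked against each other with the standard formulas for adjusted R² and the overall F-statistic. The sketch below is plain Python for illustration (the study used R). It assumes the number of fitted observations is the residual degrees of freedom plus the six estimated parameters, which is an inference from the output rather than a figure stated in the slides.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


def f_statistic(r2, n, p):
    """Overall F-statistic of the regression, derived from R-squared."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))


p = 5                      # predictors: C, Mn, Si, Cr, Ni
residual_df = 1230         # from the model summary
n = residual_df + p + 1    # assumed number of fitted observations
r2 = 0.7433                # Multiple R-squared, as reported (rounded)

print(round(adjusted_r_squared(r2, n, p), 4))
print(round(f_statistic(r2, n, p), 1))
```

Both values land close to the reported 0.7422 and 712.2; the small gaps come from starting with R² rounded to four decimals.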
Cross Validation Results for Linear Regression
The linear regression model was evaluated with three cross-validation schemes. 5-Fold Cross-Validation (5-Fold CV) and Leave-One-Out Cross-Validation (LOOCV) give the same error estimates (RMSE 48.27, MAE 32.28, R² 0.81), while Nested CV gives a somewhat more pessimistic estimate (RMSE 53.28, MAE 33.46, R² 0.75).
| Metric | Value |
|---|---|
| RMSE | 48.27 |
| MAE | 32.28 |
| R² | 0.81 |
K-Fold Cross-Validation
| Metric | Value |
|---|---|
| RMSE | 48.27 |
| MAE | 32.28 |
| R² | 0.81 |
LOOCV Cross-Validation
| Metric | Value |
|---|---|
| RMSE | 53.28 |
| MAE | 33.46 |
| R² | 0.75 |
Nested Cross-Validation
Support Vector Machines (SVM) for Regression (SVR)
Support Vector Machines (SVM) are algorithms that model data by finding optimal boundaries and that handle nonlinear patterns through kernels. For our dataset, an SVM with a radial kernel predicts the martensite start temperature (Ms) from the chemical elements C, Mn, Ni, Si, and Cr. The radial kernel lets the model capture nonlinear relationships between composition and Ms that a single linear equation cannot represent.
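Two ingredients of support vector regression with a radial kernel can be written out explicitly: the radial (RBF) kernel, which measures similarity between two compositions, and the epsilon-insensitive loss, which ignores errors smaller than a tolerance. The sketch below is plain Python for illustration (the study used R); the values of `gamma` and `epsilon` and the two compositions are assumptions chosen for the example, not the settings used in the study.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y, y_hat, epsilon):
    """Zero inside the epsilon tube, linear in the error outside it."""
    return max(0.0, abs(y - y_hat) - epsilon)


# Hypothetical compositions (C, Mn, Si, Cr, Ni) in weight percent.
steel_a = (0.40, 0.80, 0.30, 1.00, 1.50)
steel_b = (0.20, 0.80, 0.30, 1.00, 1.50)

gamma = 1.0      # assumed kernel width parameter
epsilon = 10.0   # assumed tolerance, in the units of Ms

print(rbf_kernel(steel_a, steel_a, gamma))   # identical inputs give 1.0
print(rbf_kernel(steel_a, steel_b, gamma))   # similarity decays with distance
print(epsilon_insensitive_loss(400.0, 405.0, epsilon))
print(epsilon_insensitive_loss(400.0, 425.0, epsilon))
```

Predictors on very different scales contribute unequally to the squared distance, which is why inputs are usually standardized before fitting a radial-kernel model.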
| Metric | Value |
|---|---|
| RMSE | 35.93 |
| MAE | 20.98 |
| R² | 0.90 |
SVM: K-Fold Cross-Validation
| Metric | Value |
|---|---|
| RMSE | 52.61 |
| MAE | 28.49 |
| R² | 0.79 |
SVM: LOOCV Cross-Validation
| Metric | Value |
|---|---|
| RMSE | 40.09 |
| MAE | 22.24 |
| R² | 0.86 |
SVM: Nested Cross-Validation
Model Comparison Results
In this study, we compared the performance of Linear Regression and Support Vector Machine (SVM) models in predicting the martensite start temperature (Ms). Using 5-fold cross-validation, we found that SVM outperformed Linear Regression, achieving a lower MAE (~21 vs ~32), a higher R² (~0.90 vs ~0.81), and a lower RMSE (~36 vs ~48).
| CV Method | Metric | Linear Regression | SVM |
|---|---|---|---|
| 5-Fold | RMSE | 48.27 | 35.93 |
| 5-Fold | MAE | 32.28 | 20.98 |
| 5-Fold | R² | 0.81 | 0.90 |
| LOOCV | RMSE | 48.27 | 52.61 |
| LOOCV | MAE | 32.28 | 28.49 |
| LOOCV | R² | 0.81 | 0.79 |
| Nested CV | RMSE | 53.28 | 40.09 |
| Nested CV | MAE | 33.46 | 22.24 |
| Nested CV | R² | 0.75 | 0.86 |
Model Comparison Results
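The comparison can be summarized programmatically by checking, for each cross-validation method and metric, which model does better and by how much. The sketch below is plain Python for illustration (the study used R); the numbers are copied from the comparison table, and the helper names are assumptions made for the example.

```python
# (Linear Regression, SVM) values from the comparison table.
RESULTS = {
    ("5-Fold", "RMSE"): (48.27, 35.93),
    ("5-Fold", "MAE"): (32.28, 20.98),
    ("5-Fold", "R2"): (0.81, 0.90),
    ("LOOCV", "RMSE"): (48.27, 52.61),
    ("LOOCV", "MAE"): (32.28, 28.49),
    ("LOOCV", "R2"): (0.81, 0.79),
    ("Nested CV", "RMSE"): (53.28, 40.09),
    ("Nested CV", "MAE"): (33.46, 22.24),
    ("Nested CV", "R2"): (0.75, 0.86),
}
LOWER_IS_BETTER = {"RMSE", "MAE"}


def relative_change(linear, svm):
    """Percent change of the SVM value relative to linear regression."""
    return (svm - linear) / linear * 100


def svm_is_better(metric, linear, svm):
    """Errors should be lower; R2 should be higher."""
    return svm < linear if metric in LOWER_IS_BETTER else svm > linear


summary = {
    key: svm_is_better(key[1], linear, svm)
    for key, (linear, svm) in RESULTS.items()
}

for (method, metric), (linear, svm) in RESULTS.items():
    change = relative_change(linear, svm)
    print(f"{method:9s} {metric:4s} {change:+6.1f}%  SVM better: {summary[(method, metric)]}")
```

SVM wins on all three metrics under 5-fold and nested CV, while under LOOCV it has the lower MAE but a higher RMSE and lower R².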
Model Comparison Plot
Figure: Model comparison results
Methodology-Hooman
- Modeling Approach:
  - Untransformed Model: Directly modeled Ms using the predictors C, Mn, Ni, Si, and Cr, with interaction terms.
  - Log-Transformed Model: Modeled log(Ms) to handle non-normality and stabilize variance, using the same predictors and interaction terms.
  - Model Improvements: removal of predictors, introduction of interaction parameters, removal of outliers.
  - Model Diagnostics: ANOVA, AIC, cross-validation, check for multicollinearity, removal of influential points.
  - Model Evaluation: The log-transformed model showed significantly better performance, with a lower AIC and cross-validation MSE. Residual deviance and cross-validation confirmed that the log model generalized better to unseen data.
- Cross-Validation Refinement:
  - K-Fold Cross-Validation with more folds
  - Leave-One-Out Cross-Validation (LOOCV)
- Programming was done in R (R Core Team 2021) using RStudio (version 2024.04.2).
- Packages used: tidyverse (Wickham et al. 2019), classpackage (Buker and Seals 2024), ggplot2 (Wickham 2016), psych (Revelle 2024), and boot (Davison and Hinkley 1997).
Models: Untransformed Model
M_s = 769.41 - 286.71 C - 16.42 Mn - 14.04 Ni - 13.89 Si - 10.13 Cr - 41.45 C:Mn - 8.36 C:Ni
| Variable | Mean ± SD | Regression Coefficient | P-value |
|---|---|---|---|
| C | 0.36 ± 0.1 | -286.71 | < 2e-16 |
| Mn | 0.79 ± 0.3 | -16.42 | 1.36E-13 |
| Ni | 1.55 ± 0.5 | -14.04 | < 2e-16 |
| Si | 0.35 ± 0.2 | -13.89 | 1.70E-13 |
| Cr | 1.04 ± 0.7 | -10.13 | < 2e-16 |
| C:Mn | N/A | -41.45 | < 2e-16 |
| C:Ni | N/A | -8.36 | 9.68E-10 |
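The interaction terms C:Mn and C:Ni mean that the effect of carbon on Ms depends on how much Mn and Ni the steel contains. The sketch below, in plain Python for illustration (the study used R), evaluates the untransformed model at the mean composition listed for the predictors and computes the resulting slope with respect to carbon. The function names are assumptions made for the example.

```python
INTERCEPT = 769.41
MAIN_EFFECTS = {"C": -286.71, "Mn": -16.42, "Ni": -14.04, "Si": -13.89, "Cr": -10.13}
INTERACTIONS = {("C", "Mn"): -41.45, ("C", "Ni"): -8.36}


def predict_ms(x):
    """Main effects plus products of the interacting predictors."""
    ms = INTERCEPT + sum(MAIN_EFFECTS[e] * x[e] for e in MAIN_EFFECTS)
    ms += sum(beta * x[a] * x[b] for (a, b), beta in INTERACTIONS.items())
    return ms


def carbon_slope(x):
    """Change in Ms per weight percent C: depends on the Mn and Ni content."""
    return (
        MAIN_EFFECTS["C"]
        + INTERACTIONS[("C", "Mn")] * x["Mn"]
        + INTERACTIONS[("C", "Ni")] * x["Ni"]
    )


# Mean composition of the predictors (weight percent).
mean_steel = {"C": 0.36, "Mn": 0.79, "Ni": 1.55, "Si": 0.35, "Cr": 1.04}

print(round(predict_ms(mean_steel), 2))
print(round(carbon_slope(mean_steel), 2))
```

At the mean Mn and Ni levels the carbon slope is noticeably steeper than the main-effect coefficient of -286.71 alone, which is the practical meaning of the negative interaction terms.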
Models: Log-Transformed Model
\log(M_s) = 6.69 - 0.51 C - 0.032 Mn - 0.0255 Ni - 0.0226 Si - 0.0175 Cr - 0.0751 C:Mn - 0.0154 C:Ni
| Variable | Mean ± SD | Regression Coefficient | P-value |
|---|---|---|---|
| C | 0.36 ± 0.1 | -0.51 | < 2e-16 |
| Mn | 0.79 ± 0.3 | -0.032 | < 2e-16 |
| Ni | 1.55 ± 0.5 | -0.0255 | < 2e-16 |
| Si | 0.35 ± 0.2 | -0.0226 | 4.48E-13 |
| Cr | 1.04 ± 0.7 | -0.0175 | < 2e-16 |
| C:Mn | N/A | -0.0751 | < 2e-16 |
| C:Ni | N/A | -0.0154 | 1.01E-11 |
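Predictions from the log-transformed model have to be exponentiated to return to the scale of Ms. The sketch below, in plain Python for illustration (the study used R), evaluates the model at the mean composition of the predictors. It rests on two assumptions: the intercept is taken as +6.69, since a negative intercept would give an Ms close to zero, and log is taken to be the natural logarithm.

```python
import math

INTERCEPT = 6.69  # assumed positive; see the note above
MAIN_EFFECTS = {"C": -0.51, "Mn": -0.032, "Ni": -0.0255, "Si": -0.0226, "Cr": -0.0175}
INTERACTIONS = {("C", "Mn"): -0.0751, ("C", "Ni"): -0.0154}


def predict_log_ms(x):
    """Linear predictor on the log scale, including interactions."""
    log_ms = INTERCEPT + sum(MAIN_EFFECTS[e] * x[e] for e in MAIN_EFFECTS)
    log_ms += sum(beta * x[a] * x[b] for (a, b), beta in INTERACTIONS.items())
    return log_ms


def predict_ms(x):
    """Back-transform the log-scale prediction (natural log assumed)."""
    return math.exp(predict_log_ms(x))


# Mean composition of the predictors (weight percent).
mean_steel = {"C": 0.36, "Mn": 0.79, "Ni": 1.55, "Si": 0.35, "Cr": 1.04}

log_ms_mean = predict_log_ms(mean_steel)
ms_mean = predict_ms(mean_steel)
print(round(log_ms_mean, 4), round(ms_mean, 1))
```

Under these assumptions the back-transformed prediction at the mean composition falls within a few percent of the untransformed model's prediction for the same steel, which supports reading the intercept as positive.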
Conclusion: Overview
Evaluation of Two Models:
Linear Regression Model
Support Vector Machine Model
Cross-validation Methods Used:
k-fold Cross-validation
Leave-one-out Cross-validation (LOOCV)
Nested Cross-validation
Conclusion: Key Findings
Mean Absolute Error (MAE): Under 5-fold cross-validation, SVM had a lower MAE (~21) than Linear Regression (~32), meaning SVM’s predictions were closer to the actual values on average.
R-squared (R²): SVM showed a higher R² (~0.90) than Linear Regression (~0.81), indicating that SVM explained about 90% of the variability in Ms, while Linear Regression accounted for about 81%.
Root Mean Squared Error (RMSE): SVM had a lower RMSE (~36) than Linear Regression (~48), which reflects fewer large prediction errors.
Overall Performance: Under 5-fold and nested cross-validation, SVM outperformed Linear Regression on all three metrics. Under LOOCV, SVM still had the lower MAE (28.49 vs 32.28) but a higher RMSE (52.61 vs 48.27) and a slightly lower R² (0.79 vs 0.81), so its advantage in predicting martensite start temperature (Ms) is clearest under 5-fold and nested CV.